Concatenative speech synthesis for European Portuguese
Authors
Abstract
This paper describes our ongoing work in the area of text-to-speech synthesis, specifically on concatenative techniques. Our preliminary work consisted of investigating current trends in concatenative synthesis and the problems that can arise when existing state-of-the-art solutions are applied to the specific case of European Portuguese. Our ultimate goal is to develop a text-to-speech system that can be trained for any speaker's voice in a fully automatic way, i.e., a customized text-to-speech synthesizer for any voice reading a predetermined text. Our first steps in this direction addressed automatic segmentation and alignment of recorded speech, optimized inventory design for concatenative synthesis, unit selection, and optimal coupling of the selected units.
Similar resources
Simulation of Human Speech Production Applied to the Study and Synthesis of European Portuguese
A new articulatory synthesizer (SAPWindows), with a modular and flexible design, is described. A comprehensive acoustic model and a new interactive glottal source were implemented. Perceptual tests and simulations made possible by the synthesizer helped deepen our knowledge of one of the most important characteristics of European Portuguese, the nasal vowels. First attempts at incorpora...
WFST Based Unit Selection for Concatenative Speech Synthesis in European Portuguese
The goal of the current work is to use Weighted Finite-State Transducers (WFSTs) to model the unit selection task in a concatenative text-to-speech system. One of the major difficulties is the design of a perceptually meaningful cost function that weights and combines several features of the available inventory units, matching them to the target information. The WFST approach allows for great ...
Stages and methods of preparing syllable and diphone speech databases for a Persian text-to-speech system
Speech databases are part of concatenative text-to-speech synthesis systems. The phonetic quality of the databases plays a significant role in the naturalness of the synthesized speech. This paper introduces two speech databases for Persian, one syllable-based and one diphone-based, and examines how they were developed, their specifications, and their advantages relative to each other. ...
Aiuruetê: a high-quality concatenative text-to-speech system for Brazilian Portuguese with demisyllabic analysis-based units and a hierarchical model of rhythm production
Aiuruetê is a high-quality concatenative TTS system for Brazilian Portuguese. Its name illustrates the challenges we have set ourselves as a research paradigm: to feed the system with the specificities of our language, highlighted by an up-to-date discussion of the Phonology/Phonetics and prosody/segments interfaces, without a huge computational cost. The choice for the concat...
Archisegment-based letter-to-phone conversion for concatenative speech synthesis in Portuguese
A letter-to-phone conversion scheme is proposed for Portuguese which excludes representation of allophonic detail. Phonetically unstable segments are treated as archisegments, their articulatory weakness being analyzed in terms of feature underspecification. Besides solving classical problems of allophony and allomorphy, this analysis provides an efficient principle for building a unit inventor...